A Robust Finite-state Parser for French

Author

  • Jean-Pierre Chanod
Abstract

This paper describes a robust finite-state parser implemented for French. The parser attaches morpho-syntactic tags to each word and determines clause boundaries. It is a reductionist parser based on finite-state networks and their intersection. We describe essential elements of the rule-writing system and show how it is applied to handle various phenomena, such as argument uniqueness, agreement, and apposition. We present results indicating that the parser can parse technical manuals with high accuracy (in a test sample, 95% of part-of-speech and functional tags were correct). The average number of parses per sentence is very low: more than 92% of sentences produce fewer than 4 parses, including the correct one. A test on very long sentences from newspaper corpora and a discussion of errors provide more insight into the parser.
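
The reductionist idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy lexicon, the tag names, and the two constraints below are hypothetical, and the readings are enumerated explicitly, whereas the parser described here intersects finite-state networks. The sketch only shows the principle of starting from every reading and discarding those that violate a constraint.

```python
from itertools import product

# Hypothetical toy lexicon (an assumption for illustration only):
# each word maps to the set of tags it may carry.
LEXICON = {
    "la": {"DET", "PRON"},
    "porte": {"NOUN", "VERB"},
    "ferme": {"VERB", "ADJ", "NOUN"},
}


def all_readings(words):
    """Enumerate every tag sequence the lexicon allows for the sentence."""
    return set(product(*(sorted(LEXICON[w]) for w in words)))


# Hypothetical constraints. Each one stands in for a finite-state network
# that accepts only the tag sequences satisfying it.
def exactly_one_verb(tags):
    return tags.count("VERB") == 1


def no_det_before_verb(tags):
    return all(not (a == "DET" and b == "VERB") for a, b in zip(tags, tags[1:]))


CONSTRAINTS = [exactly_one_verb, no_det_before_verb]


def parse(words):
    """Keep only the readings accepted by every constraint (set intersection)."""
    readings = all_readings(words)
    for constraint in CONSTRAINTS:
        readings &= {r for r in readings if constraint(r)}
    return readings


if __name__ == "__main__":
    for reading in sorted(parse(["la", "porte", "ferme"])):
        print(reading)
```

Each constraint only removes readings and never adds any, so the order in which constraints are applied does not change the result. Several readings may survive, which matches the abstract's remark that a sentence can yield more than one parse.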

Similar resources

Rules and Constraints in a French Finite-State Grammar

This report describes the rule system of a robust finite-state parser implemented for French. The parser attaches syntactic tags to each word as well as part-of-speech and morphological tags, and determines clause boundaries. It is a reductionist parser, i.e. it removes readings from the originally ambiguous text. The underlying parser is based on finite-state networks and their intersection. We des...

A Language-Independent Shallow-Parser Compiler

We present a rule-based shallow-parser compiler, which makes it possible to generate a robust shallow parser for any language, even in the absence of training data, by resorting to a very limited number of rules that aim at identifying constituent boundaries. We contrast our approach with other approaches used for shallow parsing (i.e. finite-state and probabilistic methods). We present an evaluation of o...

PROFER: predictive, robust finite-state parsing for spoken language

The natural language processing component of a speech understanding system is commonly a robust, semantic parser, implemented as either a chart-based transition network, or as a generalized left-right (GLR) parser. In contrast, we are developing a robust, semantic parser that is a single, predictive finite-state machine. Our approach is motivated by our belief that such a finite-state parser can ult...

Comparative Study of GLR Parser with Finite-state Predictors and Chart-based Semantic Parsers

The natural language processing component of a speech understanding system is commonly a robust, semantic parser, implemented as either a chart-based transition network, or as a generalized left-right (GLR) parser. In contrast, we are developing a robust, semantic parser that is a single, predictive finite-state machine. Our approach is motivated by our belief that such a finite-state parser ca...

A Robust Parser for Unrestricted Greek Text

In this paper we describe a method for the efficient parsing of real-life Greek texts at the surface syntactic level. A grammar consisting of non-recursive regular expressions describing Greek phrase structure has been compiled into a cascade of finite state transducers used to recognize syntactic constituents. The implemented parser lends itself to applications where large scale text processin...

Publication date: 1997